ADF Data Description Developer's Guide v1.5.3 RF

The Allotrope Data Format (ADF) [[!ADF]] consists of several APIs and taxonomies. This document constitutes the Developer's Guide for the ADF Data Description API (ADF-DD) [[!ADF-DD]]. It provides examples of how to use the ADF-DD API to store meta data of the Data Package and the Data Cubes along with experimental or process data and contextual meta data. ADF-DD is based on semantic web standards and linked data concepts using the RDF Data Model.

Disclaimer

THESE MATERIALS ARE PROVIDED "AS IS" AND ALLOTROPE EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE WARRANTIES OF NON-INFRINGEMENT, TITLE, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This document is part of a set of specifications on the Allotrope Data Format [[!ADF]].


Introduction

The Allotrope Data Format (ADF) defines an interface for storing scientific observations from analytical chemistry. It is intended for long-term stability of archived analytical data and fast real-time access to it. The ADF Data Description API defines an interface for storing business and technical meta data. This Developer's Guide provides examples of how to use the ADF Data Description API to store meta data of the Data Package and the Data Cubes along with experimental or process data and contextual meta data.

The document is structured as follows: First, the core operations that the ADF-DD API offers for querying and modifying the RDF graph (query, insert, update and delete) are explained with examples. Then a reference to a complete example application of the API is given.

Document Conventions

Namespaces

Within this specification, the following namespace prefix bindings are used:

Prefix   Namespace
owl:     http://www.w3.org/2002/07/owl#
rdf:     http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs:    http://www.w3.org/2000/01/rdf-schema#
xsd:     http://www.w3.org/2001/XMLSchema#
skos:    http://www.w3.org/2004/02/skos/core#
dct:     http://purl.org/dc/terms/
foaf:    http://xmlns.com/foaf/0.1/
adf-dp:  http://purl.allotrope.org/ontologies/datapackage#
adf-dc:  http://purl.allotrope.org/ontologies/datacube#
af-r:    http://purl.allotrope.org/ontologies/result#
af-x:    http://purl.allotrope.org/ontologies/property#
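The prefix bindings listed above can be read as a simple lookup from prefix to namespace. The following sketch, which uses only the JDK, shows how a prefixed name such as dct:title expands to a full IRI and how the same bindings could be rendered as SPARQL PREFIX declarations. The class PrefixExpander and its methods are hypothetical names chosen for illustration; they are not part of the ADF-DD API. The Jena Model offers its own prefix mapping facilities, which would normally be preferred in application code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the ADF-DD or Jena API.
final class PrefixExpander {

    // Prefix bindings copied from the namespace list of this document.
    static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("owl", "http://www.w3.org/2002/07/owl#");
        PREFIXES.put("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
        PREFIXES.put("rdfs", "http://www.w3.org/2000/01/rdf-schema#");
        PREFIXES.put("xsd", "http://www.w3.org/2001/XMLSchema#");
        PREFIXES.put("skos", "http://www.w3.org/2004/02/skos/core#");
        PREFIXES.put("dct", "http://purl.org/dc/terms/");
        PREFIXES.put("foaf", "http://xmlns.com/foaf/0.1/");
        PREFIXES.put("adf-dp", "http://purl.allotrope.org/ontologies/datapackage#");
        PREFIXES.put("adf-dc", "http://purl.allotrope.org/ontologies/datacube#");
        PREFIXES.put("af-r", "http://purl.allotrope.org/ontologies/result#");
        PREFIXES.put("af-x", "http://purl.allotrope.org/ontologies/property#");
    }

    // Expands "prefix:localName" to a full IRI; rejects unknown prefixes.
    static String expand(String prefixedName) {
        int colon = prefixedName.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a prefixed name: " + prefixedName);
        }
        String prefix = prefixedName.substring(0, colon);
        String namespace = PREFIXES.get(prefix);
        if (namespace == null) {
            throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
        return namespace + prefixedName.substring(colon + 1);
    }

    // Renders all bindings as SPARQL PREFIX declarations, one per line.
    static String sparqlPrologue() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : PREFIXES.entrySet()) {
            sb.append("PREFIX ").append(e.getKey())
              .append(": <").append(e.getValue()).append(">\n");
        }
        return sb.toString();
    }
}
```

Generating the PREFIX declarations from one shared table is one way to avoid a query declaring a namespace that differs from the one used when the data was written.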

Number Formatting

Within this document, decimal numbers will use a dot "." as the decimal mark.

RDF Graph Operations

The Data Description API provides functions for the core operations on the RDF graph, i.e. the data stored in the ADF Triple Store.

The API of Apache Jena [[APACHE-JENA]] constitutes the ADF Data Description API. In the following sections, the API is introduced and illustrated by examples. For a complete description of the API, please consult the JavaDoc API documentation and the sources listed in the References section.

The RDF Graph

In Apache Jena, an RDF graph [[!rdf11-concepts]] is called a model and is represented by the Model interface [[APACHE-JENA]], [[JENA-INTRO]]. The following sections illustrate how the Jena Model can be used for operations on the RDF graph. For more details on the Jena API, please refer to [[APACHE-JENA]] and [[JENA-INTRO]].

Querying an RDF Graph

The RDF graph can be queried by iterating over the Jena Model or by running SPARQL queries on the Model. The following sections illustrate both approaches.

Iterating over the RDF Graph

The Model interface of Apache Jena contains a listStatements() method that returns an iterator (a StmtIterator, which is a subtype of Java's Iterator) for iterating over all statements in the model. The Statement interface provides accessor methods for the subject, predicate and object of a statement.

The following example illustrates iterating over the model:

Java:
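// Iterate over all RDF Statements in the DataDescription.
// A StmtIterator is an Iterator, not an Iterable, so it cannot be used in a for-each loop.
StmtIterator iter = dataDescription.listStatements();
while (iter.hasNext()) {
	Statement stmt = iter.nextStatement();

	// get the subject
	Resource subject = stmt.getSubject();

	// get the predicate
	Property predicate = stmt.getPredicate();

	// get the object
	RDFNode object = stmt.getObject();
}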

C#:
// Iterate over all RDF Statements in the DataDescription
StmtIterator iter = dataDescription.listStatements();
while (iter.MoveNext())
{
	Statement stmt = iter.Current;

	// get the subject
	Resource subject = stmt.getSubject();

	// get the predicate
	Property predicate = stmt.getPredicate();

	// get the object
	RDFNode @object = stmt.getObject();
}
				

Querying the RDF Graph via SPARQL

This section describes how to query an RDF graph via a SPARQL query.

Example Query

The following figure shows a total ion chromatogram and the spectra that belong to it. Assume that we want to query the highlighted data:

Figure: Query example

Representation of the Query in SPARQL

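To keep it simple, the example uses human-readable identifiers surrounded by guillemets (« ») instead of artificial identifiers. The guillemet notation is not valid SPARQL syntax, so the query is illustrative and must be rewritten with the actual identifiers before it can be executed.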

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX qudt: <http://qudt.org/schema/qudt#>
PREFIX af-r: <http://purl.allotrope.org/ontologies/result#>
PREFIX af-x: <http://purl.allotrope.org/ontologies/property#>
SELECT ?scan ?scanTimeValue ?scanTimeUnit ?totalIonCurrent ?basePeakPosValue ?basePeakPosUnit ?basePeakHeightValue ?basePeakHeightUnit
WHERE {
	?spectrum a «af-r:MS1 spectrum» ;
		«af-x:total ion current» ?totalIonCurrent ;
		«af-x:base peak» ?basePeak .
	?basePeak a «af-r:peak» ;
		«af-x:total retention time» [
			qudt:numericValue ?basePeakPosValue ;
			qudt:unit ?basePeakPosUnit ;
		] ;
		«af-x:intensity» [
			qudt:numericValue ?basePeakHeightValue ;
			qudt:unit ?basePeakHeightUnit ;
		] .
	?scan a «af-x:scan» ;
		«af-x:scanned spectrum» ?spectrum ;
		«af-x:start time» [
			qudt:numericValue ?scanTimeValue ;
			qudt:unit ?scanTimeUnit ;
		] .
}

A short query with artificial identifiers could look like this:

PREFIX qudt: <http://qudt.org/schema/qudt#>
PREFIX af-r: <http://purl.allotrope.org/ontologies/result#>
PREFIX af-x: <http://purl.allotrope.org/ontologies/property#>
SELECT ?spectrum ?basePeakHeightValue ?basePeakHeightUnit
WHERE {
	?spectrum a af-r:AFR_0000472 ;
	af-x:AFX_0000361 [
		qudt:numericValue ?basePeakHeightValue ;
		qudt:unit ?basePeakHeightUnit ;
	] .
}					

Execution of the SPARQL Query

Given a SPARQL query as queryString and a Jena Model, we can execute the query on the model as follows, using the Jena ARQ API [[JENA-ARQ]]:

JAVA:
Query query = QueryFactory.create(queryString);
try (QueryExecution qexec = QueryExecutionFactory.create(query, dataDescription)) {
	ResultSet resultSet = qexec.execSelect();
	while (resultSet.hasNext()) {
		QuerySolution solution = resultSet.nextSolution();
		// extract the queried information via solution.get("variable name")
		// and do something with the result
	}
}

C#:
Query query = QueryFactory.create(queryString);
QueryExecution qexec = QueryExecutionFactory.create(query, dataDescription);
try
{
	ResultSet results = qexec.execSelect();
	while (results.MoveNext())
	{
		QuerySolution solution = results.Current;
		// extract the queried information via solution.getResource("variable name");
		// and do something with the result
	}
}
finally
{
	qexec.close();
}
					

In this example, we first create a Query out of the given query string. Then we execute it and finally iterate over the results.

Continuing the example from above, the query results can be processed like this:

JAVA:
// Iterate over the ResultSet
while (resultSet.hasNext()) {
	QuerySolution soln = resultSet.nextSolution();

	// Get the Scan Time, the Total Ion Current and the Base Peak of the spectrum
	String scanTime = soln.get("scanTimeValue") + " " + soln.get("scanTimeUnit");
	String tic = String.valueOf(soln.get("totalIonCurrent"));
	String basePeakMZ = soln.get("basePeakPosValue") + " " + soln.get("basePeakPosUnit");
	String basePeakint = soln.get("basePeakHeightValue") + " " + soln.get("basePeakHeightUnit");
}

C#:
// Iterate over the ResultSet
while (results.MoveNext())
{
	QuerySolution soln = results.Current;

	// Get the Scan Time, the Total Ion Current and the Base Peak of the spectrum
	String scanTime = soln.get("scanTimeValue") + " " + soln.get("scanTimeUnit");
	String tic = soln.get("totalIonCurrent").ToString();
	String basePeakMZ = soln.get("basePeakPosValue") + " " + soln.get("basePeakPosUnit");
	String basePeakint = soln.get("basePeakHeightValue") + " " + soln.get("basePeakHeightUnit");
}
					

Please refer to [[JENA-ARQ]] for more details and additional examples on SPARQL queries with the Jena ARQ API.

Modifying Statements in an RDF Graph

Inserting Statements into an RDF Graph

The Jena Model that represents the RDF graph consists of a set of statements. A statement is a triple that consists of subject, predicate and object. Statements can be constructed and inserted into the Jena Model via createResource() followed by addProperty() or, as in the example below, addLiteral(), in fluent API style:

JAVA and C#:
Resource myBalance = dataDescription.createResource(<myBalanceURI>) //
	.addLiteral(DCTerms.title, "My analytical balance that ...");
				

In this example, we added the triple

<myBalanceURI> dct:title 'My analytical balance that ...'

to the model, using the Dublin Core Terms 'title' predicate. The namespace declaration for Dublin Core Terms (dct) can be found in the Namespaces section.

Updating a Statement in the RDF Graph

Given a data cube and its representation in an RDF graph, assume we have a typo in the label of the dimension representing the intensity. The following SPARQL 1.1 Update [[!sparql11-update]], [[SEARBORNE-SPARQL]] query illustrates how this typo can be fixed:

PREFIX adf-dc: <http://purl.allotrope.org/ontologies/datacube#>
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

DELETE { ?dimension rdfs:label 'intensty' }
INSERT { ?dimension rdfs:label 'intensity' }
WHERE
  {
	?dimension rdfs:label 'intensty' .
	?dimension a qb:DimensionProperty .
  }
				

Given the Jena Model of the RDF graph, we can execute this query using the Jena ARQ API [[JENA-ARQ]] to fix the typo:

JAVA and C#:
UpdateAction.parseExecute(queryString, dataDescription);
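
To make the effect of the DELETE/INSERT/WHERE request more concrete, the following sketch imitates it on a plain Java set of triples. It is not the Jena API and not how a SPARQL engine is implemented: the class LabelFixSketch, the representation of a triple as a three-element list of strings, and the subject names used in the usage note are assumptions made for illustration. The sketch only shows the order of evaluation: the WHERE pattern is matched first, then the matched triples are deleted, then the new triples are inserted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Plain-Java stand-in for an RDF graph; this is NOT the Jena Model API.
final class LabelFixSketch {

    static final String RDF_TYPE = "rdf:type";
    static final String RDFS_LABEL = "rdfs:label";
    static final String DIMENSION = "qb:DimensionProperty";

    // A triple is modelled as an immutable [subject, predicate, object] list.
    static List<String> triple(String s, String p, String o) {
        return List.of(s, p, o);
    }

    // Imitates DELETE/INSERT/WHERE: match first, then delete, then insert.
    static void fixLabel(Set<List<String>> graph, String oldLabel, String newLabel) {
        // WHERE: subjects that carry the old label and are typed as dimension properties.
        List<String> matches = new ArrayList<>();
        for (List<String> t : graph) {
            boolean hasOldLabel = t.get(1).equals(RDFS_LABEL) && t.get(2).equals(oldLabel);
            if (hasOldLabel && graph.contains(triple(t.get(0), RDF_TYPE, DIMENSION))) {
                matches.add(t.get(0));
            }
        }
        // DELETE template: remove the misspelled label of every match.
        for (String dimension : matches) {
            graph.remove(triple(dimension, RDFS_LABEL, oldLabel));
        }
        // INSERT template: add the corrected label for every match.
        for (String dimension : matches) {
            graph.add(triple(dimension, RDFS_LABEL, newLabel));
        }
    }
}
```

With a graph that contains a dimension ex:dim1 labelled 'intensty' and an untyped resource ex:other with the same label, calling fixLabel(graph, "intensty", "intensity") changes only the label of ex:dim1, because ex:other does not satisfy the type pattern in the WHERE clause.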
				

Removing a Statement from the RDF Graph

With SPARQL 1.1 Update, statements can be removed from an RDF graph by using the DELETE operation (cf. the example in the previous section). The following request removes all statements whose subject is an archive information package (AIP) whose 50-year retention period has expired as of 2014-12-05.

PREFIX core: <http://purl.allotrope.org/core/metadata#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DELETE { ?aip  ?p ?v }
WHERE
  {
	?aip core:retentionTime ?date .
	FILTER ( ?date < "1964-12-05T00:00:00-02:00"^^xsd:dateTime )
	?aip ?p ?v .
  }
				

Given the Jena Model of the RDF graph, we can execute this query using the Jena ARQ API [[JENA-ARQ]]:

JAVA and C#:
UpdateAction.parseExecute(queryString, dataDescription);
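
The cutoff literal in the FILTER of the DELETE request in this section is the reference date 2014-12-05 minus the retention period of 50 years. The following sketch, which uses only java.time from the JDK, shows how such a literal could be computed instead of being hard-coded. The class RetentionCutoff and its method names are hypothetical and are not part of the ADF-DD or Jena API; the UTC offset of -02:00 used in the usage note is taken from the example request.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of the ADF-DD or Jena API.
final class RetentionCutoff {

    // Returns the xsd:dateTime lexical form of (referenceDate - retentionYears).
    static String cutoffLiteral(OffsetDateTime referenceDate, int retentionYears) {
        OffsetDateTime cutoff = referenceDate.minusYears(retentionYears);
        // ISO_OFFSET_DATE_TIME produces the ISO 8601 form with an explicit UTC offset.
        return cutoff.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
    }

    // Builds a FILTER clause of the same shape as the one in the DELETE request.
    static String filterClause(OffsetDateTime referenceDate, int retentionYears) {
        return "FILTER ( ?date < \"" + cutoffLiteral(referenceDate, retentionYears)
                + "\"^^xsd:dateTime )";
    }
}
```

Called with OffsetDateTime.of(2014, 12, 5, 0, 0, 0, 0, ZoneOffset.ofHours(-2)) and a retention period of 50 years, cutoffLiteral yields the value used in the example request. When the clause is assembled from user input rather than from computed dates, a parameterized query facility should be used instead of string concatenation.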
				

Complete Example

The ADF-DD First Steps Example Application illustrates the Java API of ADF-DD by one complete code example. It is contained in the file FirstSteps.java in the package org.allotrope.adf.dd.firststeps.

Change History

Version Release Date Remarks
0.4.0 2015-06-29
  • Initial Working Draft version
1.0.0 RC 2015-09-17
  • Renamed document from User Manual to Developer's Guide
  • Renamed section Example Code to Complete Example
  • Removed unnecessary sub section from section Complete Example
  • Added provenance information to the Complete Example
1.0.0 2015-09-29
  • Updated versions, dates and document status
  • Updated introduction
  • Removed code from section Complete Example
1.1.0 RC 2016-03-11
  • Updated versions, dates and document status
  • Added section on number formatting to document conventions
  • Added information and examples for C#/.NET
1.1.0 RF 2016-03-31
  • Updated versions, dates and document status
1.1.5 2016-05-13
  • Updated versions and dates
1.2.0 Preview 2016-09-23
  • Updated versions and dates
1.2.0 RC 2016-12-07
  • Updated versions and dates
1.3.0 Preview 2017-03-31
  • Updated versions and dates
  • Updated section 2.3.3 (Example 9)
1.3.0 RF 2017-06-30
  • Updated versions and dates
1.4.3 RC 2018-10-11
  • Updated versions and dates
1.4.5 RF 2018-12-17
  • Updated versions and dates
1.5.0 RC 2019-12-12
  • Updated versions and dates
1.5.0 RF 2020-03-03
  • Updated HDF5 reference link
1.5.3 RF 2020-11-30
  • Updated broken reference links
  • Updated PURL and DOCS server links to relative links
  • Reformat the document header